Problem Details (RFC 9457): Doing API Errors Well

May 02, 2024

HTTP APIs play a critical role in orchestrating the seamless exchange of value across software applications. These invisible highways of data are fundamental to the services we use daily, from social media platforms to mobile banking apps. Yet, as with any form of communication, clarity and structure are paramount, more so when conveying "bad news", such as errors. Unfortunately, effective error communication in API interactions is an area often overlooked, leading to confusion, frustration, and inefficiency in diagnosing and resolving issues.

This is part I in a two-part series on Problem Details.

Delivering Bad News Effectively

The essence of delivering bad news, such as API errors, effectively lies not just in the delivery itself, but in ensuring the message is structured, informative, and, most importantly, actionable. Historically, the way APIs communicated errors varied widely, leading to a Wild West of error formats, each with its own assumptions and idiosyncrasies. This inconsistency not only made error handling a nightmare for developers, but also hampered interoperability between different systems. The challenges increase as we navigate this maze of ever-multiplying APIs. The lack of standardization in error responses isn't just a minor inconvenience; it's a roadblock. It stretches the mean time to integration, ramps up integration costs, and turns ongoing maintenance into a real total cost of ownership (TCO) concern. And with the new wave of API consumers, such as AI bots and models, the challenge is only magnified. These consumers demand a precision and clarity that the current babel of error messages can't provide.

Recognizing this pain point, the Internet Engineering Task Force (IETF) introduced RFC 7807 [1], "Problem Details for HTTP APIs," creating a standard for expressing errors in a more structured and helpful way.

As the digital landscape evolves, standards must adapt to keep pace with new challenges and insights. Thus, RFC 7807 has been succeeded by RFC 9457 [2], which obsoletes it and marks a significant evolution in how we communicate API errors. This update not only refines the original standard, but also introduces new features to enhance error reporting further. In essence, RFC 9457 aims to perfect the art of delivering bad news, ensuring API errors are not just communicated, but communicated in a way that removes guesswork, aids quick diagnostics, and facilitates smoother interactions between machines.

As we delve into the advancements brought by RFC 9457, it's essential to understand why evolving our approach to API error reporting matters. It's not just about making life easier for developers; it's about creating more resilient, understandable, and user-friendly digital services.

API Error Handling Anti-Patterns

Despite its importance, error handling often becomes an afterthought, leading to a range of anti-patterns that complicate API consumption and integration.

Here are some common anti-patterns that arise:

Not Providing Useful Error Feedback: One of the fundamental expectations of any API is that it will guide consumers through errors by providing meaningful feedback. When APIs fail to deliver useful error messages, developers are left in the dark, forced to rely on guesswork, trial and error, and debugging to understand what went wrong. This not only slows down development but also increases integration time and costs.
Inventing Custom Ways of Communicating Errors: We want creativity in how providers solve problems with APIs, but perhaps not in how they communicate errors. Inventing unique methods for reporting errors, rather than following established standards, leads to a lack of consistency that forces API consumers to adapt to a different error handling mechanism for every API they work with. This variability complicates the development of common error handling routines, making integrations more cumbersome and error-prone.
Hiding Errors in Successful Responses: Sometimes, APIs mask errors within what appears to be a successful response, such as embedding error details within a 200 OK status. This practice can mislead consumers into believing a request was successful when it wasn't, complicating error detection and handling. It obscures the true nature of the interaction, leading to misinterpretations and flawed application logic.
Leaking Stack Trace Information: Error handling does not just impact the developer experience. Poor choices can also cause security issues. Some APIs inadvertently expose too much information in their error responses, such as detailed stack traces. While this might be done to aid debugging, it poses a significant security risk, offering potential attackers insights into the API's underlying implementation and structure. This information leakage can be exploited to create more targeted attacks, putting the entire application at risk.
Every API Responding to Errors Differently: The lack of a unified approach to error reporting means that every API might choose to communicate errors differently, in terms of both structure and content. This inconsistency is one of the biggest challenges for API consumers, especially when integrating multiple APIs into a single application. It necessitates bespoke error handling for each API, inflating the complexity and maintenance overhead of applications. As you scale to deliver more APIs, how you deal with errors should be part of your API governance remit.
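
To make the cost of these anti-patterns concrete, here is a minimal Python sketch of the bespoke handling a consumer ends up writing. The three error shapes, the BESPOKE_ERRORS list, and the extract_message helper are all hypothetical and invented for illustration; they do not describe any real API.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical (status, body) pairs from three imaginary APIs.
# None of these shapes comes from a real service.
BESPOKE_ERRORS: List[Tuple[int, Dict[str, Any]]] = [
    (200, {"success": False, "errorMessage": "name is required"}),  # error hidden in 200 OK
    (400, {"error": {"code": 17, "msg": "name is required"}}),      # custom nested envelope
    (400, {"errors": ["name is required"]}),                        # bare list of strings
]


def extract_message(status: int, body: Dict[str, Any]) -> Optional[str]:
    """Return an error message if the response signals a failure, else None."""
    # Shape 1: the status code cannot be trusted, so the body must be inspected.
    if body.get("success") is False:
        return str(body.get("errorMessage", "unknown error"))
    # Shape 2: a custom envelope with its own member names.
    if isinstance(body.get("error"), dict):
        return str(body["error"].get("msg", "unknown error"))
    # Shape 3: a list of plain strings.
    if isinstance(body.get("errors"), list) and body["errors"]:
        return "; ".join(str(item) for item in body["errors"])
    # Nothing recognizable in the body: fall back to the status code.
    return f"HTTP {status}" if status >= 400 else None


messages = [extract_message(status, body) for status, body in BESPOKE_ERRORS]
```

Every new API adds another branch to a function like this, and each branch encodes an assumption the provider never documented in a machine-readable way.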

The Impact of Poor API Error Handling

The anti-patterns laid out above lead to a host of issues extending beyond mere inconveniences and can impact the success of an API.

Increased Development Time and Costs: Developers spend excessive time deciphering errors and implementing custom handlers for each API before they are confident the work is done. A longer mean time to integration increases the overall cost of adopting an API, as well as the cost of ongoing maintenance.
Poor Developer Experience: The lack of clear, consistent error responses frustrates developers, potentially deterring them from using the API.
Security Vulnerabilities: Leaking implementation details through errors can expose APIs to security breaches. It's not a question of if this will happen, but when.
Integration Complexity: Varied error reporting standards increase the complexity and fragility of integrations. This might well lead to consumer churn where a more stable API will be chosen instead.

What is Problem Details for HTTP APIs?

Initially introduced by RFC 7807, and recently further refined in RFC 9457, "Problem Details for HTTP APIs" is a standard providing a blueprint for expressing error details in a structured, consistent, and machine-readable format. It's designed to make error responses more informative and actionable, not just for human developers, but for the systems that consume APIs at runtime. With the release of RFC 9457, the standard has been enhanced to facilitate the inclusion of even more context and metadata to aid in diagnosing and resolving issues, and it adds an IANA registry [3] for hosting common problem type URIs.

The Problem Details object structure encapsulates error information in a way that can be universally understood across different systems and technologies.

type: A URI reference that identifies the problem type. It's intended to provide human operators with a place to find more information about the error. If not present or applicable, it’s assumed to be “about:blank”.
status: The HTTP status code generated by the origin server for this occurrence of the problem.
title: A short, human-readable summary of the problem type. It should not change from occurrence to occurrence of the problem, except for purposes of localization.
detail: A human-readable explanation specific to this occurrence of the problem. Unlike the title, this field's content can vary by occurrence.
instance: A URI reference that identifies the specific occurrence of the problem. It may or may not yield further information if dereferenced.
Extensions: Any field may be added to give additional information or context to consuming clients. Using extensions is recommended over asking a client to parse the `detail` property. It's also recommended to employ a must-ignore pattern for how clients consume this information: clients should be expected to ignore any additional field they do not explicitly support.
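
The members listed above can be sketched on the provider side as follows. This is a minimal illustration, not a reference implementation: the ProblemDetails class, its problem_type attribute (serialized as "type"), and the to_dict method are names assumed for this example. The application/problem+json media type is the one the RFC defines for JSON problem details.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Media type defined by the RFC for JSON problem details.
PROBLEM_JSON = "application/problem+json"


@dataclass
class ProblemDetails:
    """Illustrative container for the five standard members plus extensions."""

    problem_type: str = "about:blank"  # serialized as "type"
    status: Optional[int] = None
    title: Optional[str] = None
    detail: Optional[str] = None
    instance: Optional[str] = None
    extensions: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        standard = {
            "type": self.problem_type,
            "status": self.status,
            "title": self.title,
            "detail": self.detail,
            "instance": self.instance,
        }
        # Omit members that were not supplied rather than sending nulls.
        body = {name: value for name, value in standard.items() if value is not None}
        for name, value in self.extensions.items():
            # Extensions sit at the top level, so they must not shadow standard members.
            if name in standard:
                raise ValueError(f"extension {name!r} clashes with a standard member")
            body[name] = value
        return body


problem = ProblemDetails(
    problem_type="https://problems-registry.smartbear.com/missing-body-property",
    status=400,
    title="Missing body property",
    detail="The request is missing an expected body property.",
    extensions={"code": "400-09"},
)
headers = {"Content-Type": PROBLEM_JSON}
payload = json.dumps(problem.to_dict())
```

Keeping extensions in a separate dictionary and merging them at serialization time is one design choice among several; it makes accidental collisions with the standard members easy to detect.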

Here's an example Problem Details response [4] which contains two extension properties (code and an errors array).

{
    "type": "https://problems-registry.smartbear.com/missing-body-property",
    "status": 400,
    "title": "Missing body property",
    "detail": "The request is missing an expected body property.",
    "code": "400-09",
    "instance": "/logs/regisrations/d24b2953-ce05-488e-bf31-67de50d3d085",
    "errors": [
        {
            "detail": "The body property {name} is required",
            "pointer": "/name"
        }
    ]
}
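
On the consuming side, the must-ignore pattern could look roughly like the Python sketch below. It parses the example response above with one extra member, traceId, added purely to show an extension the client does not understand. The helper names parse_problem and field_errors, and the traceId member and its value, are assumptions made for this illustration.

```python
import json
from typing import Any, Dict, List, Tuple

STANDARD_MEMBERS = ("type", "status", "title", "detail", "instance")

# The example response from this article, plus a hypothetical "traceId"
# extension that this client does not know about.
RAW = """
{
    "type": "https://problems-registry.smartbear.com/missing-body-property",
    "status": 400,
    "title": "Missing body property",
    "detail": "The request is missing an expected body property.",
    "code": "400-09",
    "instance": "/logs/regisrations/d24b2953-ce05-488e-bf31-67de50d3d085",
    "errors": [
        {"detail": "The body property {name} is required", "pointer": "/name"}
    ],
    "traceId": "hypothetical-value"
}
"""


def parse_problem(raw: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a problem details document into standard members and extensions."""
    body = json.loads(raw)
    known = {name: body[name] for name in STANDARD_MEMBERS if name in body}
    # A missing "type" is treated as "about:blank".
    known.setdefault("type", "about:blank")
    extensions = {k: v for k, v in body.items() if k not in STANDARD_MEMBERS}
    return known, extensions


def field_errors(extensions: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Collect (pointer, detail) pairs from the "errors" extension, if present."""
    pairs: List[Tuple[str, str]] = []
    errors = extensions.get("errors")
    if not isinstance(errors, list):
        return pairs  # extension absent or in an unexpected shape: ignore it
    for entry in errors:
        if isinstance(entry, dict) and "pointer" in entry:
            pairs.append((str(entry["pointer"]), str(entry.get("detail", ""))))
    return pairs


known, extensions = parse_problem(RAW)
per_field = field_errors(extensions)
```

The client branches on the type and status members, reads the extensions it supports (here, errors), and never fails because of a member it does not recognize, such as traceId.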

Check out part II: Problem Details (RFC 9457): Getting Hands-On with API Error Handling

Conclusion

Problem Details serves as an important mechanism for effective API error communication. By addressing the common anti-patterns that have long plagued HTTP APIs, the standard paves the way for a more reliable, secure, and user-friendly API ecosystem. As we move forward, the importance of adopting such standardized practices will only continue to grow, especially in our increasingly interconnected world.

Ref	Description	URL
[1]	RFC 7807 - Problem Details for HTTP APIs	https://tools.ietf.org/html/rfc7807
[2]	RFC 9457 - Problem Details for HTTP APIs (obsoletes RFC 7807)	https://www.rfc-editor.org/info/rfc9457
[3]	IANA Registry of Problem Types	https://www.iana.org/assignments/http-problem-types
[4]	Example Problem Details Response	https://problems-registry.smartbear.com/missing-body-property