Story-time: C++, bounds checking, performance, and compilers


My thoughts after reading Story-time: C++, bounds checking, performance, and compilers.

Summary

The author is Chandler Carruth, who works as a Distinguished Software Engineer at Google.

  1. Past misconceptions about the performance overhead of bounds checking: The author was initially skeptical, believing that bounds checking would severely degrade performance. This belief stemmed from earlier reports and simple experiments. At the time, compilers could not optimize bounds checks effectively, which led him to think that enabling them by default was unrealistic. Recent results, however, show that the overhead is much lower than expected. He emphasizes that bounds checking can be enabled across all standard library types with an overhead of only around 0.3%.

  2. Shift in the perceived necessity of bounds checking: Because bounds checking was believed to be costly, little effort went into implementing it historically. Today it is being re-evaluated as an essential component of security and safety. His stance has shifted toward advocating that bounds checking be enabled by default in all code, to reduce the security risks and latent errors that arise in its absence.

  3. Advancements in compilers and bounds-check optimization: He points out that the early performance problems were due to compilers not being designed to optimize bounds checks. Because C and C++ were built on unsafe abstractions, their compilers lacked the infrastructure to handle such checks well. Recent compilers like LLVM and GCC have evolved to optimize bounds checks effectively, which opens up the possibility of enabling bounds checking by default while satisfying both safety and performance.
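
To make the cost concrete, here is a minimal sketch of what a bounds check amounts to in C++. CheckedSpan and sum are hypothetical names invented for this note; they are not from the article or from any standard library. The check itself is one comparison and one rarely taken branch, and in a loop like the one below the index is already known to be in range, which is the kind of case an optimizer may be able to simplify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical wrapper, for illustration only: a view over a contiguous
// buffer whose operator[] checks the index before every access.
template <typename T>
class CheckedSpan {
public:
    CheckedSpan(T* data, std::size_t size) : data_(data), size_(size) {
        assert(data != nullptr || size == 0);
    }

    T& operator[](std::size_t i) const {
        // The whole bounds check: one comparison and one rarely taken branch.
        if (i >= size_) std::abort();  // fail fast instead of touching bad memory
        return data_[i];
    }

    std::size_t size() const { return size_; }

private:
    T* data_;
    std::size_t size_;
};

// Every index used here is below s.size() by construction, so a compiler may
// be able to remove or hoist the check. Whether it does depends on the
// compiler and optimization level.
inline long sum(CheckedSpan<const int> s) {
    long total = 0;
    for (std::size_t i = 0; i < s.size(); ++i) total += s[i];
    return total;
}
```

Calling sum on a span over {1, 2, 3, 4} returns 10, while indexing past the end aborts the program instead of reading adjacent memory.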

Questions Arising After Reading

Languages and Tools Supporting Bounds Checking

  • Language Level
    • Rust, Swift: Provide bounds checking by default, enforcing safety.
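    • C, C++: Bounds checking is not applied by default (plain indexing with [] is unchecked), but it can be added through library features such as the at() member functions, hardened standard library modes, or additional tools.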
  • Tools and Libraries
    • AddressSanitizer (ASan): Detects out-of-bounds access issues at runtime in C/C++.
    • UBSan (UndefinedBehaviorSanitizer): Detects many kinds of undefined behavior at runtime, including some out-of-bounds array indexing.
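
As a small sketch of the library route in C++, the standard at() member function performs a bounds check and throws std::out_of_range on a bad index, whereas operator[] with the same index would be undefined behavior. The helper name try_get is hypothetical and used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Returns the element at `index`, or std::nullopt if the index is out of range.
// `try_get` is a hypothetical helper name used only for this sketch.
inline std::optional<int> try_get(const std::vector<int>& values,
                                  std::size_t index) {
    try {
        return values.at(index);  // at() throws std::out_of_range on a bad index
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
    // By contrast, values[index] with a bad index is undefined behavior.
    // Building with -fsanitize=address is one way to have such accesses
    // reported at runtime during testing.
}
```

For a vector holding {10, 20, 30}, try_get(v, 1) yields 20 and try_get(v, 3) yields an empty optional.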

Spatial Safety vs. Temporal Safety

  • Spatial Safety
    • Spatial safety refers to preventing issues related to the bounds of memory accesses.
    • This concept ensures that a program only reads or writes data within its allocated memory regions.
    • In other words, the goal is to prevent access to locations outside the allocated range when using arrays or pointers.
    • Key Problems
      • Buffer overflow, buffer underflow
    • Solutions
      • Bounds checking, memory protection tools (AddressSanitizer), safe languages
  • Temporal Safety
    • Temporal safety refers to preventing issues related to the lifetime of memory access.
    • This concept ensures that a program does not access invalid memory (memory that has already been freed or not yet allocated).
    • Key Problems
      • Dangling pointers, double free, use-after-free
    • Solutions
      • Smart pointers, reference counting, dynamic analysis tools (Valgrind), garbage collection (GC)
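
On the temporal side, here is a minimal C++ sketch of how smart pointers can turn a would-be dangling access into a checkable condition. A std::weak_ptr observes an object without owning it, and lock() returns an empty std::shared_ptr once the object has been destroyed. The helper name read_or is hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: reads through a weak_ptr only if the object is still
// alive, and returns `fallback` otherwise.
inline int read_or(const std::weak_ptr<int>& observer, int fallback) {
    // lock() yields an owning shared_ptr if the object still exists,
    // or an empty one if it has already been destroyed.
    if (std::shared_ptr<int> alive = observer.lock()) {
        return *alive;
    }
    return fallback;  // a raw pointer here would have been dangling
}
```

While the owning std::shared_ptr is alive, read_or returns the stored value. After the owner is reset, it returns the fallback instead of reading freed memory.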

What are PGO (Profile-Guided Optimization) and FDO (Feedback-Directed Optimization)?

PGO and FDO both refer to optimizing a program using data gathered from actually running it. In practice the two terms are largely interchangeable; which one is used depends mostly on the toolchain or community.

  • PGO (Profile-Guided Optimization)
    • PGO collects a program’s execution profile (which contains performance data) beforehand and uses it to drive compiler optimizations.
    • The compiler identifies frequently executed sections (hotspots) and uses this data to maximize performance.
    • How it Works
      • Profile Data Collection: Run the program with various inputs and scenarios to collect profile data (execution frequencies, branch probabilities, etc.).
      • Profile Data Analysis: The compiler analyzes the collected data to differentiate between frequently executed code (hot code) and rarely executed code (cold code).
      • Optimizing Compilation: Hotspots are optimized for performance, while cold code is optimized to improve space efficiency.
    • Examples
      • -fprofile-generate and -fprofile-use flags in GCC and LLVM/Clang.
  • FDO (Feedback-Directed Optimization)
    • In common usage, FDO is another name for the same technique rather than a separate one: GCC's documentation and Google's tooling tend to say FDO, while LLVM/Clang and MSVC tend to say PGO.
    • Variants
      • Instrumentation-based: The instrumented-build workflow described under PGO.
      • Sampling-based (AutoFDO): Profiles are collected from an ordinary optimized binary using hardware performance counters (for example, with Linux perf) and fed into a later build, which avoids the overhead of running an instrumented binary.
    • Related but Distinct
      • JIT (Just-In-Time) compilers, such as those in Java virtual machines, also collect runtime data, but they re-optimize code while the program is running. This is usually called adaptive or dynamic optimization rather than FDO.
    • Examples
      • -fauto-profile in GCC and -fprofile-sample-use in Clang.
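
The instrumentation-based workflow can be sketched with a small C++ function that has one frequently taken branch and one rarely taken branch. The function name checksum, the file name app.cpp, and the workload are hypothetical. The build commands in the comments use the real -fprofile-generate and -fprofile-use flags, and assume app.cpp contains this function plus a main() that drives it with representative input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Example instrumented workflow with GCC (Clang accepts the same flags, but
// its raw profile must first be merged with llvm-profdata before reuse):
//   g++ -O2 -fprofile-generate app.cpp -o app   # 1. instrumented build
//   ./app                                       # 2. run on representative input
//   g++ -O2 -fprofile-use app.cpp -o app        # 3. rebuild using the profile
//
// Hypothetical workload: non-negative values are assumed to dominate, so the
// first branch is "hot" and the second is "cold" in the collected profile.
inline std::uint64_t checksum(const std::vector<int>& values) {
    std::uint64_t total = 0;
    for (int v : values) {
        if (v >= 0) {
            total += static_cast<std::uint64_t>(v);  // hot path
        } else {
            total ^= 0xFFFFu;                        // cold path
        }
    }
    return total;
}
```

With a profile showing that the first branch dominates, the compiler can, for instance, lay out the hot path as the fall-through case. The exact transformations applied depend on the compiler and version.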
