False sharing

Overview

False sharing is a performance-degrading usage pattern that can arise in systems with distributed, coherent caches. It occurs at the granularity of the smallest resource block managed by the caching mechanism, which is the cache line.

  • It refers to the situation in which two processors write to different addresses that map to the same cache line.
  • As the processors keep writing, the cache line bounces back and forth between their caches, and the cache coherence protocol generates a significant amount of communication.

Example

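#include <chrono>
#include <cstddef>
#include <cstdio>
#include <new>  // std::hardware_destructive_interference_size
#include <pthread.h>

// Cache line size: use the standard library's hint when available, otherwise assume 64 bytes.
#if defined(__cpp_lib_hardware_interference_size)
constexpr std::size_t CACHE_LINE_SIZE = std::hardware_destructive_interference_size;
#else
constexpr std::size_t CACHE_LINE_SIZE = 64;
#endif
constexpr int MAX_THREADS = 8;
constexpr int MANY_ITERATIONS = 1000000000;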

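// Each thread increments only its own counter. volatile forces a real load and store per iteration.
void* worker(void* arg) {
  volatile int* counter = static_cast<int*>(arg);
  for (int i = 0; i < MANY_ITERATIONS; i++) *counter = *counter + 1;
  return NULL;
}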
void test1(int num_threads) {
  auto begin = std::chrono::high_resolution_clock::now();

  pthread_t threads[MAX_THREADS];
  int counter[MAX_THREADS] = {};  // zero-initialized; adjacent ints share a cache line

  for (int i = 0; i < num_threads; i++)
    pthread_create(&threads[i], NULL, &worker, &counter[i]);
  for (int i = 0; i < num_threads; i++)
    pthread_join(threads[i], NULL);

  auto end = std::chrono::high_resolution_clock::now();
  auto elapsed =
      std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
  printf("Time measured: %.3f seconds.\n", elapsed.count() * 1e-9);
}

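// Pads each counter to CACHE_LINE_SIZE bytes so neighboring counters land on different cache lines.
struct padded_t
{
  int counter;
  char padding[CACHE_LINE_SIZE - sizeof(int)];
};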
void test2(int num_threads) {
  auto begin = std::chrono::high_resolution_clock::now();

  pthread_t threads[MAX_THREADS];
  padded_t counter[MAX_THREADS] = {};  // zero-initialized; counters are CACHE_LINE_SIZE bytes apart

  for (int i = 0; i < num_threads; i++)
    pthread_create(&threads[i], NULL, &worker, &(counter[i].counter));
  for (int i = 0; i < num_threads; i++)
    pthread_join(threads[i], NULL);

  auto end = std::chrono::high_resolution_clock::now();
  auto elapsed =
      std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
  printf("Time measured: %.3f seconds.\n", elapsed.count() * 1e-9);
}

int main()
{
    test1(8);
    test2(8);
}

Running the code above produces results like the following. The first line is from test1 (adjacent counters) and the second is from test2 (padded counters).

Time measured: 2.946 seconds.
Time measured: 2.533 seconds.

References

  • False sharing
  • Lecture 10: Cache Coherence
