Fairness is an essential requirement of any operating system scheduler. Unfortunately, existing fair scheduling algorithms are either inaccurate or inefficient and non-scalable for multiprocessors. This problem is becoming increasingly severe as the hardware industry continues to produce larger scale multi-core processors. This paper presents Distributed Weighted Round-Robin (DWRR), a new scheduling algorithm that solves this problem. With distributed thread queues and small additional overhead to the underlying scheduler, DWRR achieves high efficiency and scalability. Besides conventional priorities, DWRR enables users to specify weights to threads and achieve accurate proportional CPU sharing with constant error bounds. DWRR operates in concert with existing scheduler policies targeting other system attributes, such as latency and throughput. As a result, it provides a practical solution for various production OSes. To demonstrate the versatility of DWRR, we have implemented it in Linux kernels 2.6.22.15 and 2.6.24, which represent two vastly different scheduler designs. Our evaluation shows that DWRR achieves accurate proportional fairness and high performance for a diverse set of workloads.