We explore the implementation tradeoffs associated with FlexTM’s versioning and conflict detection mechanisms. Our results demonstrate that FlexTM achieves a 5× speedup over high-quality software TMs and a 1.8× speedup over hybrid TMs (those with software always in the loop), with no loss in policy flexibility. We find that the distributed commit protocol improves performance by 2%–14% over an aggressive centralized arbiter mechanism that also allows parallel commits. Finally, we compare two approaches to handling speculative transaction state that overflows from the cache: an aggressive hardware controller (as used in the base FlexTM design) that both manages and accesses the overflowed state, and a hardware–software approach dubbed FlexTM-S (FlexTM-Streamlined), in which software manages the overflow region but uses a metadata cache to accelerate speculative data replacements and their subsequent accesses. We demonstrate that FlexTM-S’s performance is within 10% of FlexTM’s despite its substantially simpler virtualization mechanism.